8 research outputs found
Visualizing the Diversity of Representations Learned by Bayesian Neural Networks
Explainable Artificial Intelligence (XAI) aims to make learning machines less
opaque, and offers researchers and practitioners various tools to reveal the
decision-making strategies of neural networks. In this work, we investigate how
XAI methods can be used for exploring and visualizing the diversity of feature
representations learned by Bayesian Neural Networks (BNNs). Our goal is to
provide a global understanding of BNNs by making their decision-making
strategies a) visible and tangible through feature visualizations and b)
quantitatively measurable with a distance measure learned by contrastive
learning. Our work provides new insights into the posterior distribution in
terms of human-understandable feature information with regard to the
underlying decision-making strategies. The main findings of our work are the
following: 1) global XAI methods can be applied to explain the diversity of
decision-making strategies of BNN instances, 2) Monte Carlo dropout with
commonly used dropout rates exhibits increased diversity in feature
representations compared to the multimodal posterior approximation of
MultiSWAG, 3) the diversity of learned feature representations correlates
strongly with the uncertainty estimate for the output, and 4) the inter-mode
diversity of the multimodal posterior decreases as the network width increases,
while the intra-mode diversity increases. These findings are consistent with
recent deep neural network theory, providing additional intuition about what
the theory implies in terms of human-understandable concepts.
Comment: 16 pages, 18 figures
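To make finding 2) concrete, here is a minimal PyTorch sketch, not the
authors' code: the toy architecture, the dropout rate, and the plain L2
distance between representations are all placeholder assumptions (the paper
learns its distance measure with contrastive learning). It draws several
Monte Carlo dropout instances of one network and measures how much their
feature representations differ:

```python
# Minimal sketch: sampling "network instances" via Monte Carlo dropout and
# comparing their feature representations. Architecture and metric are
# illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn

model = nn.Sequential(              # stand-in for any dropout-equipped network
    nn.Linear(32, 64), nn.ReLU(),
    nn.Dropout(p=0.5),              # a commonly used dropout rate
    nn.Linear(64, 16),              # output treated as the feature representation
)
x = torch.randn(8, 32)              # a small batch of inputs

model.train()                       # keep dropout active at inference (MC dropout)
with torch.no_grad():
    reps = torch.stack([model(x) for _ in range(20)])   # 20 posterior samples

# Crude diversity proxy: mean pairwise L2 distance between the sampled
# representations (the paper instead uses a contrastively learned distance).
flat = reps.flatten(start_dim=1)
diversity = torch.cdist(flat, flat).mean()
print(f"mean pairwise representation distance: {diversity.item():.3f}")
```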
FedZero: Leveraging Renewable Excess Energy in Federated Learning
Federated Learning (FL) is an emerging machine learning technique that
enables distributed model training across data silos or edge devices without
data sharing. Yet, FL inevitably introduces inefficiencies compared to
centralized model training, which will further increase the already high energy
usage and associated carbon emissions of machine learning in the future.
Although the scheduling of workloads based on the availability of low-carbon
energy has received considerable attention in recent years, it has not yet been
investigated in the context of FL. However, FL is a highly promising use case
for carbon-aware computing, as training jobs consist of energy-intensive
batch processes scheduled in geo-distributed environments.
We propose FedZero, an FL system that operates exclusively on renewable excess
energy and spare capacity of compute infrastructure to effectively reduce the
training's operational carbon emissions to zero. Based on energy and load
forecasts, FedZero leverages the spatio-temporal availability of excess energy
by cherry-picking clients for fast convergence and fair participation. Our
evaluation, based on real solar and load traces, shows that FedZero converges
considerably faster under these constraints than state-of-the-art
approaches, is highly scalable, and is robust against forecasting errors.
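To illustrate the scheduling idea, the following sketch is purely
hypothetical: the Client fields, the energy and capacity thresholds, and
select_clients are illustrative names, not FedZero's API. It shows the core
pattern of admitting only clients whose forecasted excess energy and spare
capacity can cover a round, then favoring under-represented clients for fair
participation:

```python
# Hypothetical sketch of excess-energy-aware client selection; names and
# thresholds are illustrative, not FedZero's actual interface.
from dataclasses import dataclass

@dataclass
class Client:
    name: str
    excess_energy_wh: float   # forecasted renewable excess energy for the round
    spare_capacity: float     # forecasted spare compute capacity in [0, 1]
    rounds_participated: int  # used to balance participation across clients

def select_clients(clients, energy_per_round_wh=50.0, min_capacity=0.2, k=2):
    eligible = [c for c in clients
                if c.excess_energy_wh >= energy_per_round_wh
                and c.spare_capacity >= min_capacity]
    # Prefer clients that have participated least, breaking ties by energy.
    eligible.sort(key=lambda c: (c.rounds_participated, -c.excess_energy_wh))
    return eligible[:k]

fleet = [Client("solar-eu", 80.0, 0.6, 3),
         Client("wind-us", 120.0, 0.1, 1),    # excluded: too little spare capacity
         Client("solar-asia", 60.0, 0.5, 0)]
print([c.name for c in select_clients(fleet)])  # ['solar-asia', 'solar-eu']
```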
Coordinated optimization of visual cortical maps (II) Numerical studies
It is an attractive hypothesis that the spatial structure of visual cortical
architecture can be explained by the coordinated optimization of multiple
visual cortical maps representing orientation preference (OP), ocular dominance
(OD), spatial frequency, or direction preference. In part (I) of this study we
defined a class of analytically tractable coordinated optimization models and
solved representative examples in which a spatially complex organization of the
orientation preference map is induced by inter-map interactions. We found that
attractor solutions near symmetry breaking threshold predict a highly ordered
map layout and require a substantial OD bias for OP pinwheel stabilization.
Here we examine in numerical simulations whether such models exhibit
biologically more realistic spatially irregular solutions at a finite distance
from threshold and when transients towards attractor states are considered. We
also examine whether model behavior qualitatively changes when the spatial
periodicities of the two maps are detuned and when considering more than two
feature dimensions. Our numerical results support the view that neither minimal
energy states nor intermediate transient states of our coordinated optimization
models successfully explain the spatially irregular architecture of the visual
cortex. We discuss several alternative scenarios and additional factors that
may improve the agreement between model solutions and biological observations.
Comment: 55 pages, 11 figures. arXiv admin note: substantial text overlap with
arXiv:1102.335
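As a rough intuition for what such simulations look like, here is a toy
sketch of coupled pattern-forming dynamics, emphatically not the paper's
model: a complex orientation-preference field z and a real ocular-dominance
field o relax under Swift-Hohenberg-type gradient dynamics with a simple
cubic coupling, and all parameters are illustrative:

```python
# Toy coupled-map simulation: Swift-Hohenberg-type dynamics for a complex OP
# field z and a real OD field o. Parameters are illustrative only.
import numpy as np

n, steps, dt = 64, 500, 0.01
r, k0, gamma = 0.5, 0.8, 0.5            # growth rate, preferred wavenumber, coupling
rng = np.random.default_rng(0)
z = 0.01 * (rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))
o = 0.01 * rng.standard_normal((n, n))

def lap(f):                             # periodic 5-point Laplacian, unit spacing
    return (np.roll(f, 1, 0) + np.roll(f, -1, 0)
            + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f)

def sh(f):                              # Swift-Hohenberg operator (lap + k0^2)^2 f
    return lap(lap(f)) + 2 * k0**2 * lap(f) + k0**4 * f

for _ in range(steps):                  # explicit Euler gradient descent
    dz = r * z - sh(z) - np.abs(z)**2 * z - gamma * o**2 * z
    do = r * o - sh(o) - o**3 - gamma * np.abs(z)**2 * o
    z, o = z + dt * dz, o + dt * do

print("mean |z|:", float(np.abs(z).mean()))  # pinwheels sit at the zeros of z
```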
DORA: Exploring outlier representations in Deep Neural Networks
Deep Neural Networks (DNNs) draw their power from the representations they
learn. In recent years, however, researchers have found that DNNs, while being
incredibly effective in learning complex abstractions, also tend to be infected
with artifacts such as biases, Clever Hans (CH) effects, or backdoors, due to
spurious correlations inherent in the training data. So far, existing methods
for uncovering such artifactual and malicious behavior in trained models focus
on finding artifacts in the input data, which requires both the availability of
a dataset and human intervention. In this paper, we introduce DORA
(Data-agnOstic Representation Analysis): the first automatic data-agnostic
method for the detection of potentially infected representations in Deep Neural
Networks. We further show that contaminated representations found by DORA can
be used to detect infected samples in any given dataset. We qualitatively and
quantitatively evaluate the performance of our proposed method in both
controlled toy scenarios and real-world settings, where we demonstrate the
benefit of DORA in safety-critical applications.
Comment: 21 pages, 22 figures
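A minimal sketch in the spirit of DORA's data-agnostic analysis, though not
DORA itself (which derives its representation distances from
activation-maximization signals rather than the random-noise probes and
k-nearest-neighbor outlier score assumed here):

```python
# Data-free probing sketch: flag neurons whose response profiles to synthetic
# inputs are far from all others. Probes and outlier score are assumptions.
import torch
import torch.nn as nn

layer = nn.Sequential(nn.Linear(100, 50), nn.ReLU())  # stand-in for a trained layer
probes = torch.randn(256, 100)                        # synthetic, data-free probes

with torch.no_grad():
    acts = layer(probes)               # (256 probes, 50 neurons)

profiles = acts.T                      # one response profile per neuron
profiles = (profiles - profiles.mean(1, keepdim=True)) / \
           (profiles.std(1, keepdim=True) + 1e-8)
dist = torch.cdist(profiles, profiles) # pairwise distances between neurons

# Outlier score: mean distance to the 5 nearest other neurons; a high score
# marks a representation that behaves unlike the rest of the layer.
knn = dist.topk(6, largest=False).values[:, 1:]  # drop self-distance of zero
scores = knn.mean(dim=1)
print("most anomalous neurons:", scores.topk(3).indices.tolist())
```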